Abstract

We consider a single server queueing system with admission control and the possibility 
to switch dynamically between a low and a high service rate, and examine the benefit of 
this service rate flexibility. We formulate a discounted Markov Decision Process model for 
the problem of joint admission and service control, and show that the optimal policy has 
a threshold structure for both controls. Regarding the benefit due to flexibility, we show 
that it is increasing in system congestion, and that its effect on the admission policy is to 
increase the admission threshold. We also derive a simple approximate condition between 
the admission reward and the relative cost of service rate increase, so that the service rate 
flexibility is beneficial. We finally show that the results extend to the expected average 
reward case. 

1 Introduction 

Admission control is a queue management tool that can increase the efficiency of resource uti- 
lization in many service systems. Depending on the particular application, it can be used to 
preserve system capacity for future customers who bring higher profit, to limit the number 
of admitted customers in order to provide a better quality of service to those already in, etc. 
Admission control is often employed indirectly via dynamic pricing, or by price discrimination 
(direct or indirect) among different customer types. However in several situations frequent 
price changes may not be feasible, and in order to contain congestion denying service to certain 
arriving customers may be necessary. 
On the other hand, in several queueing systems it is possible to alleviate the congestion
effects by adjusting the service capacity. For example in banks or call centers the number of 
servers may change during the day to follow variations of the arrival rate. The service rate 
may also be varied dynamically when the queue length becomes too long. Increasing (and in 
some cases decreasing) the service rate comes at a cost, but it has the advantage over admission 
control policies that fewer or no customers are turned away. It is thus of interest to explore to 
what extent a flexibility in service capacity can interact with admission control and when it can 
alleviate its effects. 

In this paper we investigate the interaction of service rate flexibility and admission control 
in an M/M/1 queueing system. The flexibility is modeled as the option to switch dynamically
between a low and a high service rate. We analyze the problem of joint admission and service 
rate control by formulating a Markovian Decision Process model to maximize the infinite horizon 
expected discounted profit. 

We also explore the effect of service rate flexibility on the optimal profit and the admission 
thresholds. Specifically, the benefit of the service rate switch option is compared against a 
baseline case where only the low service rate is used and admission control is employed. Thus, 
in principle, the benefit is due both to the existence of a higher service rate as well as the 
flexibility to use it. On the other hand, the higher service rate is not free but comes at a higher 
cost, thus the benefit of using it reflects the tradeoff between serving at a faster rate and paying 
higher operational costs. When we use the term benefit of flexibility we mean exactly how this 
tradeoff manifests itself in the presence of admission control. 

The paper develops a model of dynamic optimization of queueing systems, a large area with
a very extensive literature. Both admission and service control models have been studied
thoroughly. Stidham and Weber (1993) and Walrand (1988) survey several dynamic optimization
models developed for queueing control.

For single-class customer problems, such as the one analyzed in this paper, admission control
makes sense when there is an exogenous holding cost rate function. A simple model in this
direction was first presented in Naor (1969), where arriving customers are admitted or not based
on the observed queue length, with the objective of maximizing the customers' overall (social) benefit
from receiving a reward after service completion minus a linearly increasing holding cost per unit
time of delay. It is shown that a socially optimal policy admits fewer customers than those who
would decide to enter based on an individual optimality criterion. Stidham (1985) considers a
GI/M/1 queue under infinite horizon discounted cost, assuming a convex and nondecreasing
holding cost rate function. It is shown that the optimal policy has a threshold structure if and
only if the optimal benefit is concave in the number of customers in the system, which in turn
depends on convexity of the holding cost rate function. As in Stidham (1985), we also consider
a convex, nondecreasing holding cost rate, which implies a threshold property of the optimal
admission policy.

On the other hand, in multi-class systems with finite capacity admission control may be
useful even in the absence of holding costs, because in this case admitting a customer implies the
possibility of a loss of profit from a future higher class customer. Miller (1969) considers a system
with n parallel and identical servers, no waiting room and m customer classes, which contribute
different fixed rewards to the system. This model results in a threshold type optimal policy with
a preferred class. Lippman and Ross analyze the optimal admission rule for a system with
one server and no waiting room which receives offers from customers according to a joint service
time and reward probability distribution. Carrizosa et al. (1998) and Ormeci et al. (2001) also
investigate properties of optimal admission policies for certain loss systems. Carrizosa et al.
(1998) develop an optimal static admission policy in an M/G/c/c queueing system with k
customer classes with generally different service requirements and service rewards. Ormeci et al.
(2001) examine the problem of dynamic admission control in a two class loss Markovian queueing
system with different service rates and different fixed rewards for the two customer classes.

The admission control problem has also been analyzed in queueing systems under heavy traffic.
In this framework the dynamic optimization is usually approximated
by a diffusion control problem, following the approach of Harrison (1988). Recent works in this
area include Ward and Kumar (2008) and Kocaga and Ward (2010), both analyzing admission
control under customer abandonments. Ward and Kumar (2008) analyze a GI/G/1 queue in
the balanced heavy traffic regime, where the optimal control depends on the sample path of the
diffusion and the resulting asymptotically optimal admission control policy of threshold type depends
on second moment data of the interarrival and service times. Kocaga and Ward (2010) consider
the same problem in a multi-class environment under discounted expected cost minimization.

Dynamic service control in queueing systems is an equally large field. Several problems can be
viewed as service control models, including controlled server vacations, server allocation policies
in polling systems, etc. In an early work Crabill (1974) examines dynamic service control under
infinite horizon average expected cost in a maintenance system with a finite number of available
service rates, a linear holding cost rate and a reward collected at service completions. It is shown
that the optimal service rate is increasing in the number of customers waiting in line. The
monotonicity of the optimal service rate is also shown in Lippman (1975) in the framework of an
M/M/1 queue, with service rates varying in a closed set and the holding cost rate increasing and
convex. George and Harrison (2001) consider the service control problem in an M/M/1 queue
where service rates are dynamically selected from a closed subset of $[0, \infty]$, under no switching
cost, state-dependent holding cost and rate-dependent service cost. They develop an asymptotic
method for computing the optimal policy under average cost minimization by solving a sequence
of approximating problems, each involving a truncation of the holding cost function. They prove
that the optimal policies of the approximating problems converge monotonically to the optimal
policy of the original problem and derive an implementable policy and a performance bound
at each iteration. In our model we also derive a monotonicity property of the service control
component of the problem, under a convexity assumption on the holding cost function. In the
works mentioned above there are no switching costs for changing the service rate. We refer
to Lu and Serfozo (1984), Hipp and Holzbaur (1988) and Kitaev and Serfozo (1999) for models
that include service rate switching costs, resulting in hysteretic policies.

In the area of joint admission and service control, two works related to ours are Ata and Shneorson
(2006) and Adusumilli and Hasenbein (2010), both motivated by and extending the work of
George and Harrison (2001). Ata and Shneorson (2006) consider the joint admission and service
control problem in an M/M/1 queue with adjustable arrival and service rates, under
long-run average welfare maximization. They also formulate and solve an associated dynamic
pricing problem. They show that the optimal arrival and service rates are monotone in the
system length. However the optimal prices, which are set to induce the optimal arrival and
service rates, are not necessarily monotone. Finally, they find that dynamic policies can result
in significantly higher profits compared to static policies. Similarly to Ata and Shneorson
(2006), Adusumilli and Hasenbein (2010) develop an efficient iterative method for computing
the optimal policy under an average cost criterion, providing a computable upper bound on
the optimality gap at each iteration step. It is also shown that service rates are monotone
increasing in the system state. Finally, two works which consider the joint admission and service
control problem in the heavy traffic regime are Ghosh and Weerasinghe (2007)
and Ghosh and Weerasinghe (2010). Ghosh and Weerasinghe (2007) examine a queueing network
where a central planner dynamically selects the service rate and buffer size that minimize
the long-run average expected cost. The optimal policy is derived from the solution
of a Brownian control problem and it consists of a feedback-type drift control and a threshold
type admission policy. Ghosh and Weerasinghe (2010) consider a Markovian system with
customer abandonments and address the infinite horizon discounted problem. In contrast to
Ghosh and Weerasinghe (2007), it is proved that the optimal joint dynamic policy derived from
the solution of the Brownian control problem is asymptotically optimal for the original problem.

In our paper we consider a simpler admission-service control model with two available service 
rates and explore the benefit of the service-rate flexibility. We first show that the optimal policy 
has a threshold structure for both controls. We introduce the value of the service rate control 
as the expected profit increase in the optimal admission control subproblem when the dynamic 
service rate change option becomes available, and show that the value is increasing with the 
initial level of congestion. Furthermore, the effect on the optimal policy is to increase the 
admission threshold. We also derive a simple sufficient condition between the admission reward 
and the service cost so that the service flexibility is worthless. Finally, we extend the results
to the long-run average expected reward case and show that an optimal average reward policy
exists and can be computed as a limit of a sequence of optimal discounted policies
under a sufficient condition on the holding cost rate function.

The rest of the paper is organized as follows. In Section 2 we define the joint control model,
show several properties of the value function and establish the threshold structure of the optimal
policy. In Section 3 we analyze the value of service flexibility and the effect of the high service
rate switch option on the admission policy. In Section 4 we analyze the problem under the
average reward criterion. In Section 5 we consider a variation of the original model, in which
the service reward is collected at departure epochs and show that the main results still hold. In
Section 6 we present a set of computational experiments exploring the sensitivity to the system
parameters. Section 7 concludes.



2 Model Description 

We consider a single server Markovian queue under the FCFS discipline, where customers arrive
according to a Poisson process with rate $\lambda$. The service rate may be dynamically set to either
a low or a high value, $\mu_l < \mu_h$, without any switching cost. The service provider receives a
fixed reward $R > 0$ per customer admitted, and incurs holding and service costs as follows. The
holding cost is equal to $h(x)$ per unit time, where $x$ is the number of customers in the system.
The function $h(x)$ is assumed to be increasing and convex. The service cost is equal to $c_l$ or $c_h$
per unit time, for using service rates $\mu_l$ and $\mu_h$, respectively, where $c_l < c_h$. We also assume
that the low cost rate $c_l$ is incurred even when the system is empty, because of the presence of
fixed costs. Let $c = c_h - c_l$ denote the service overhead paid under high service speed. Without
loss of generality we normalize $c_l = h(0) = 0$.

Since the system is Markovian, it suffices to assume that the system manager makes a decision 
at both arrival and departure epochs. Service rate decisions can be made at both arrival and 
departure epochs, whereas admission decisions are made only at arrival epochs. Assuming 
continuous time discounting at rate $\beta > 0$, the service provider's objective is to maximize the
infinite horizon expected discounted net profit. Thus, the problem can be framed as a continuous 
time Markov Decision Process, as follows. 

Let $T_j$ be the time of the $j$-th arrival, $X(t)$ a random variable denoting the number of
customers in the system at time $t$ and $I(t) = \mathbf{1}(t = T_j \text{ for some } j)$ the indicator of the event
that $t$ is an arrival epoch. We define the state vector as the pair $(X(t), I(t))$, thus the state
space is $S = \mathbb{N} \times \{0, 1\}$. State $(0,0)$ denotes an empty system.

For the action set, let $A^s(t) \in \{l, h\}$ denote the service rate employed at time $t$, where $l, h$
stand for service rates $\mu_l, \mu_h$, respectively, and $A^d(T_j) \in \{0, 1\}$ the admission decision at the
arrival epoch $T_j$, where $0, 1$ denote rejection and admission, respectively. In states $(X(t), I(t)) = (x, 1)$
corresponding to arrival epochs, the action is defined by the pair $a(t) = (a^d(t), a^s(t))$, thus
the action set is $A(x, 1) = \{0, 1\} \times \{l, h\}$. On the other hand, in states $(X(t), I(t)) = (x, 0)$
corresponding to departure epochs, the action is defined only by $a^s(t)$, thus the action set is
$A(x, 0) = \{l, h\}$. Finally, let $\Pi$ be the space of history dependent policies and $v_\beta^\pi(x,i)$ denote
the infinite horizon expected $\beta$-discounted net profit under policy $\pi \in \Pi$ with initial state $(x, i) \in S$:

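\[
v_\beta^\pi(x,i) = E^\pi\Bigl[\, \sum_{j=0}^{\infty} e^{-\beta T_j} R\, \mathbf{1}\bigl(A^d(T_j) = 1\bigr)
 - \int_0^{\infty} e^{-\beta t} \bigl[ h(X(t)) + c(A^s(t)) \bigr]\, dt \;\Bigm|\; X(0) = x,\ I(0) = i \Bigr].
\]

The optimal value function is

\[
v_\beta(x,i) = \sup_{\pi \in \Pi} v_\beta^\pi(x,i), \tag{1}
\]

and a policy $\pi^*$ is optimal if $v_\beta^{\pi^*} = v_\beta$.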

2.1 An Equivalent Model in Discrete Time 

We can construct a discrete-time version of the Markovian Decision problem as follows. Depending
on the state and the action employed, the transition rate out of any state can take values $\lambda$,
$\lambda + \mu_l$, or $\lambda + \mu_h$. Let $\Lambda = \lambda + \mu_h$ denote the maximum transition rate out of any state. Using
standard uniformization arguments (see Section 11.5 of Puterman (1994)), it follows that the
model described in (1) is equivalent to a model where transition rates are all equal to $\Lambda$ and the
transition probabilities are appropriately modified. Since in the original continuous time model
the transition rates out of a state are generally different in different states, the discrete time
formulation allows for transitions from a state back to itself so that the expected sojourn times
are equal in the two models. These are referred to as fictitious transitions.
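The uniformization step can be illustrated with a minimal sketch. The function names `uniformized_step` and `expected_sojourn` and the numerical rates are hypothetical choices made here for illustration, not part of the original model description. The sketch computes the one-step probabilities of the uniformized chain and checks that the expected sojourn time in a state equals that of the continuous-time model.

```python
def uniformized_step(x, rate, lam, mu_l, mu_h):
    """One-step probabilities of the uniformized chain in a state with x customers.

    rate is "l" or "h" (the service rate in use); the uniformization rate is
    Lam = lam + mu_h, the largest total transition rate out of any state.
    """
    Lam = lam + mu_h
    # No service completion is possible in an empty system.
    mu = 0.0 if x == 0 else (mu_l if rate == "l" else mu_h)
    p_arrival = lam / Lam
    p_departure = mu / Lam
    # The remaining mass is a fictitious transition back to the same state.
    p_fictitious = 1.0 - p_arrival - p_departure
    return {"arrival": p_arrival, "departure": p_departure, "fictitious": p_fictitious}


def expected_sojourn(x, rate, lam, mu_l, mu_h):
    """Expected time until the uniformized chain makes a real transition out of x."""
    Lam = lam + mu_h
    p = uniformized_step(x, rate, lam, mu_l, mu_h)
    # Geometric number of steps of mean length 1/Lam until a real transition.
    return (1.0 / Lam) / (1.0 - p["fictitious"])
```

For example, with `lam=0.4, mu_l=0.3, mu_h=0.5` the low-rate sojourn time in a nonempty state is `1 / (0.4 + 0.3)`, as in the continuous-time model.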

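Via this transformation, the problem can be written in a form equivalent to a discrete time
discounted Markov Decision Process, as follows

\begin{align}
v(x,1) &= \max\{ R + v(x+1,0),\ v(x,0) \}, \quad x \ge 0, \tag{2}\\
v(x,0) &= \frac{1}{\Lambda+\beta} \Bigl[ -h(x) + \lambda v(x,1) + \mu_l v(x-1,0)
 + \max\{ \delta v(x,0),\ -c + \delta v(x-1,0) \} \Bigr], \quad x > 0, \tag{3}\\
v(0,0) &= \frac{1}{\Lambda+\beta} \bigl\{ \lambda v(0,1) + \mu_h v(0,0) \bigr\}, \tag{4}
\end{align}

where $\delta = \mu_h - \mu_l$ and for simplicity we omit the subscript $\beta$, which is constant throughout.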

Note that in the discrete time formulation, the equivalent discount factor per transition is
equal to $\alpha = \frac{\Lambda}{\Lambda+\beta}$ and for $\beta > 0$ it has the standard property $0 < \alpha < 1$. For ease of
exposition we normalize the time scale so that $\Lambda + \beta = 1$. The normalization is without loss of
generality as long as the discount rate is fixed. In Section 4, where we consider the criterion of
average reward per unit time as a limit of the discounted reward problem when $\beta \to 0$ and thus
$\alpha \to 1$, we do not make this normalization assumption.

The finite horizon version of this last model is the following, where $v_n(x,i)$ denotes the
optimal discounted profit for the remaining $n$ transitions, starting at state $(x,i)$.

\begin{align}
v_{n+1}(x,1) &= \max\{ R + v_{n+1}(x+1,0),\ v_{n+1}(x,0) \}, \quad x \ge 0, \tag{5}\\
v_{n+1}(x,0) &= -h(x) + \lambda v_n(x,1) + \mu_l v_n(x-1,0) \notag\\
 &\quad + \max\{ \delta v_n(x,0),\ -c + \delta v_n(x-1,0) \}, \quad x > 0, \tag{6}\\
v_{n+1}(0,0) &= \lambda v_n(0,1) + \mu_h v_n(0,0), \tag{7}\\
v_0(x,i) &= 0, \quad x \ge 0,\ i \in \{0,1\}. \tag{8}
\end{align}



Note that in (5) the transition index on the right hand side is still $n + 1$, because after an
admission decision in state (x, 1) there is an instantaneous state switch to state (x + 1,0) or 
(x,0), so that the corresponding service-rate decision can also be made at that instant. The 
advantage of writing the optimality equations in this form is that only admission decisions are 
made in states (x, 1) and only service decisions in states (x, 0). 
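The recursion (5)-(8) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the state space is truncated at a hypothetical level `N` where arrivals are always rejected, the function name `value_iteration` and all parameter values are chosen here for illustration, and the time scale is assumed normalized so that `lam + mu_h < 1`.

```python
def value_iteration(lam, mu_l, mu_h, R, c, h, N, n_steps):
    """Iterate equations (5)-(8) on the truncated state space {0, ..., N}.

    Returns (v0, v1) with v0[x] ~ v_n(x, 0) and v1[x] ~ v_n(x, 1) for n = n_steps.
    Assumption (not in the source): arrivals finding N customers are rejected.
    """
    delta = mu_h - mu_l
    v0 = [0.0] * (N + 1)  # eq. (8): v_0(x, 0) = 0
    v1 = [0.0] * (N + 1)  # eq. (8): v_0(x, 1) = 0
    for _ in range(n_steps):
        new0 = [0.0] * (N + 1)
        new0[0] = lam * v1[0] + mu_h * v0[0]  # eq. (7)
        for x in range(1, N + 1):  # eq. (6)
            stay_low = delta * v0[x]
            go_high = -c + delta * v0[x - 1]
            new0[x] = -h(x) + lam * v1[x] + mu_l * v0[x - 1] + max(stay_low, go_high)
        new1 = [0.0] * (N + 1)
        for x in range(N):  # eq. (5): admit or reject, using the updated v_{n+1}(., 0)
            new1[x] = max(R + new0[x + 1], new0[x])
        new1[N] = new0[N]  # truncation assumption: forced rejection at x = N
        v0, v1 = new0, new1
    return v0, v1
```

A possible call is `value_iteration(0.4, 0.3, 0.5, 5.0, 1.0, lambda x: 0.5 * x, 40, 200)`, which corresponds to $\Lambda = 0.9$ and $\beta = 0.1$ under the normalization $\Lambda + \beta = 1$.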

Since the state space is infinite and the one-step reward function is not necessarily bounded,
the convergence of (5)-(8) to the optimal value function must be established.

To this end, we make the following assumption ensuring that the holding cost does not in- 
crease too rapidly with the queue length. 

Assumption 1

1. There exists a constant $\theta > 1$ such that: $h(x + 1) \le \theta h(x)$, for any $x > 0$.

2. There exists a constant $a \in [0, 1)$ and a positive integer $J$ such that: for $x > 0$,

\[
\Lambda^J [R + c + h(x + J)] \le a [R + c + h(x)]. \tag{9}
\]



Assumption 1 is quite general. It can be easily seen that it is satisfied for power cost functions
$h(x) = K x^m$, $K > 0$, $m \ge 1$, as well as exponential cost functions $h(x) = K \rho^x$ with $K > 0$ and
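As a worked illustration, the two conditions of Assumption 1 can be checked by hand in the linear special case $h(x) = Kx$. The choice $\theta = 2$ and the bound used below are assumptions made here for the sketch, and the normalization $\Lambda + \beta = 1$ with $\beta > 0$ is taken for granted, so that $0 < \Lambda < 1$.

```latex
% Condition 1 with \theta = 2, for every x \ge 1:
\frac{h(x+1)}{h(x)} = \frac{x+1}{x} \le 2 .

% Condition 2: write w(x) = R + c + Kx. For every x \ge 0,
\Lambda^{J}\, \frac{w(x+J)}{w(x)}
  = \Lambda^{J} \Bigl( 1 + \frac{KJ}{R + c + Kx} \Bigr)
  \le \Lambda^{J} \Bigl( 1 + \frac{KJ}{R + c} \Bigr) .

% Since 0 < \Lambda < 1, the right-hand side tends to 0 as J \to \infty,
% so any sufficiently large J makes it a constant a < 1, and (9) follows.
```

The same argument suggests why polynomial growth is harmless: the geometric factor $\Lambda^J$ eventually dominates any fixed power of $J$.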

In the next theorem we show that under Assumption 1 there exists an optimal policy for
the discounted problem and the finite horizon approximations converge to the unique solution
of (2)-(4).

Theorem 1 If the holding cost rate function $h(x)$ satisfies Assumption 1, then

i. The system of equations (2)-(4) has a unique solution, which equals $v_\beta$.

ii. There exists a stationary deterministic optimal policy.

iii. The solution of the system of equations (5)-(8) converges to $v_\beta$.



Proof. The proof follows by applying Theorem 11.5.3 of Puterman (1994). To do this we must
verify the following:

1. Assumption 11.5.1 of Puterman (1994) requires that all transition rates are bounded above.
This is satisfied here with $\Lambda$ being the upper bound.
2. There exists a function $w : S \to \mathbb{R}$ such that

2a. $\max_a |r(s, a)| \le M w(s)$, $\forall s \in S$, where $r(s, a)$ is the one period profit function in the
discrete time MDP (5)-(8) and $M$ is a constant.

2b. There exists a non-negative constant $k < \infty$ for which

\[
E^\pi\{ w(X_{n+1}) \mid X_n = x, I_n = i, Y_n = a \} \le k\, w(x,i),
\]

for all $a \in A(x, i)$ and $(x, i) \in S$.

2c. There exist a constant $a \in [0, 1)$ and $J \in \mathbb{Z}$ such that

\[
\Lambda^J E^\pi\{ w(X_{n+J}, I_{n+J}) \mid X_n = x, I_n = i \} \le a\, w(x,i).
\]

We will verify that the function $w(x,i) = w(x) = R + c + h(x)$, $(x,i) \in S$, satisfies the above
properties. Note that $w(x)$ is increasing in $x$.

To show 2a, note that

\[
r(s,a) =
\begin{cases}
R - h(x), & \text{for } s = (x, 1),\ a = (1, l),\\
R - c - h(x), & \text{for } s = (x, 1),\ a = (1, h),\\
-h(x), & \text{for } s = (x, i),\ a = (0, l) \text{ or } a = l,\\
-c - h(x), & \text{for } s = (x, i),\ a = (0, h) \text{ or } a = h.
\end{cases}
\]

Thus, 2a holds for $M = 1$.

To show 2b, for any $(x, i) \in S$ and $a \in A(x, i)$ the possible transitions are to states with
$x - 1$, $x$ or $x + 1$ customers. Since $w$ is increasing,

\[
E^\pi\{ w(X_{n+1}) \mid X_n = x, I_n = i, Y_n = a \} \le w(x + 1).
\]

From condition 1 of Assumption 1 and $\theta > 1$, $w(x + 1) \le \theta w(x)$, thus 2b holds for $k = \theta$.

Finally, for 2c, iterating the above inequality $J$ times we obtain
$E^\pi\{ w(X_{n+J}, I_{n+J}) \mid X_n = x, I_n = i \} \le w(x + J)$. Therefore it suffices to show that
$\Lambda^J w(x + J) \le a\, w(x)$ for some $a < 1$.
However the last inequality holds by condition 2 of Assumption 1. ■

2.2 The Optimal Threshold Policy 

In this subsection, we derive the structure of the optimal policy. Specifically, we show in Theorem
2 that both admission and high service rate controls are based on respective thresholds on the
queue length. Furthermore, in Proposition 1 we derive a sufficient condition on the values
of the parameters $R$, $c$ and $\delta$, under which the option to switch to the high service rate
is essentially of no value for the service provider.

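Let $a_n^d(x)$ be the optimal admission decision in state $(x,1)$ and $a_n^s(x)$ the optimal service
rate decision in state $(x, 0)$ when $n$ transitions remain. From the above optimality equations it
follows that

\begin{align}
a_{n+1}^d(x) &=
\begin{cases}
1, & \text{if } \Delta_{n+1}(x,0) \le R,\\
0, & \text{if } \Delta_{n+1}(x,0) > R,
\end{cases}
\quad \text{for } x \ge 0, \tag{10}\\
a_{n+1}^s(x) &=
\begin{cases}
l, & \text{if } \Delta_n(x-1,0) \le c/\delta,\\
h, & \text{if } \Delta_n(x-1,0) > c/\delta,
\end{cases}
\quad \text{for } x \ge 1, \tag{11}\\
a_{n+1}^s(0) &= l, \tag{12}
\end{align}

where

\[
\Delta_n(x,i) = v_n(x,i) - v_n(x+1,i) \tag{13}
\]

denotes the loss in future rewards because of the increased load from an accepted arrival. In
addition, let $\Delta h(x) = h(x+1) - h(x)$ be the increase in holding cost rate induced by an additional
customer.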

In order to characterize the optimal policy, we first present some intermediate properties.
Lemma 1 shows that the value function is nonincreasing in $x$.

Lemma 1 The value function $v_n(x,i)$ is nonincreasing in $x$.

Proof. We will prove that $v_n(x,i)$ is nonincreasing in $x$ for any $i = 0, 1$, or equivalently that
$\Delta_n(x,i) \ge 0$, by induction on $n$. By (8), we obtain $\Delta_0(x,i) = 0$ and the statement holds for
$n = 0$.

Suppose that $v_n(x,i)$ is nonincreasing in $x$ for some $n$. Then, for $n + 1$, we consider two cases: $i = 0$
and $i = 1$.

Case I: $i = 0$. First, for $x = 0$, by (6) and (7), we obtain:

\begin{align*}
\Delta_{n+1}(0,0) &= v_{n+1}(0,0) - v_{n+1}(1,0)\\
 &= h(1) + \lambda \Delta_n(0,1) + \delta v_n(0,0) - \max\{ \delta v_n(1,0),\ -c + \delta v_n(0,0) \}\\
 &= h(1) + \lambda \Delta_n(0,1) + \min\{ \delta \Delta_n(0,0),\ c \} \ge 0
\end{align*}

from the induction hypothesis.

For $x > 0$, the terms of (6) are nonincreasing functions of $x$ by the induction hypothesis,
the assumption that $h(x)$ is increasing in $x$ and the fact that the maximum of the
nonincreasing functions $\delta v_n(x, 0)$ and $-c + \delta v_n(x - 1, 0)$ is nonincreasing in $x$.
Therefore $v_{n+1}(x, 0)$ is nonincreasing in $x$.

Case II: $i = 1$. We obtain similarly that $v_{n+1}(x, 1)$ is nonincreasing in $x$ by (5) and the result
proved in Case I.

Therefore the statement in Lemma 1 holds for $n + 1$ and the proof is complete. ■

The monotonicity of $v_n(x,i)$ in $x$ is intuitive. It implies that $\Delta_n(x,i)$ is nonnegative, thus it
can be seen as the burden or profit reduction induced by one additional customer in state $(x,i)$.

In the next Theorem we show that the optimal policy is characterized by service and ad- 
mission thresholds. We first define a generic threshold-type function that will be used in all the 
results. 

For a function $f : \mathbb{N}_0 \to \mathbb{R}$ and $\theta \in \mathbb{R}$ define

\[
T_f(\theta) = \sup\{ k \ge 0 : f(k) \le \theta \}, \tag{14}
\]

with the convention $\sup \emptyset = -1$.

It is easy to see that $T_f(\theta)$ has the following properties:

(a) $T_f(\theta)$ is non-decreasing in $\theta$ for all non-decreasing functions $f$.

(b) If $f, g$ are such that $f(k) \le g(k)$ for all $k = 0, 1, \ldots$, then $T_f(\theta) \ge T_g(\theta)$ for any $\theta \in \mathbb{R}$.
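The threshold operator in (14) can be sketched on a finite list of function values. The name `threshold` is a hypothetical helper introduced here, and restricting $f$ to finitely many points is an assumption of the sketch: if every listed value satisfies $f(k) \le \theta$, the function returns the last index, whereas the supremum over all of $\mathbb{N}_0$ could be larger or infinite.

```python
def threshold(f_values, theta):
    """T_f(theta) = sup{k >= 0 : f(k) <= theta} over the listed values f(0), f(1), ...

    Returns -1 when no listed value satisfies f(k) <= theta (the sup of the empty set).
    """
    last = -1
    for k, fk in enumerate(f_values):
        if fk <= theta:
            last = k
    return last
```

For instance, with `f = [0.5, 1.0, 2.0, 4.0]` the call `threshold(f, 1.5)` picks out index 1, the last point at which `f` does not exceed 1.5.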
We now proceed to the Theorem. 

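Theorem 2 i. The value function $v_n(x,i)$ is concave in $x$, for $i = 0, 1$.
ii. There exist thresholds $B_n^s, B_n^d$ such that:

\begin{align}
a_{n+1}^s(0) &= l, \tag{15}\\
a_{n+1}^s(x) &=
\begin{cases}
l, & \text{if } x \le B_n^s,\\
h, & \text{if } x > B_n^s,
\end{cases}
\quad \text{for } x > 0, \tag{16}\\
a_{n+1}^d(x) &=
\begin{cases}
1, & \text{if } x \le B_{n+1}^d,\\
0, & \text{if } x > B_{n+1}^d,
\end{cases}
\quad \text{for } x \ge 0, \tag{17}
\end{align}

where

\[
B_n^s = 1 + T_{\Delta_n(\cdot,0)}\!\left(\frac{c}{\delta}\right), \qquad
B_n^d = T_{\Delta_n(\cdot,0)}(R).
\]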

Proof. The concavity of $v_n(x,i)$ in $x$ is equivalent to $\Delta_n(x,i)$ being increasing in $x$.
The proof is by induction on $n$. For $n = 0$, since $v_0(x,i) = 0$, we obtain $\Delta_0(x, i) = 0$.

It follows that $a_1^s(x) = l$ for all $x \ge 0$, thus (15) and (16) hold with $B_0^s = +\infty$.

Furthermore, from (6) and (7), since $v_0 \equiv 0$, $\Delta_1(x,0) = \Delta h(x)$, $x \ge 0$, where $\Delta h(x)$ is increasing by
assumption, thus (17) holds with $B_1^d = \sup\{ k \ge 0 : \Delta h(k) \le R \}$.

Now suppose that i. holds for some $n$. In order to prove the Theorem, it suffices to show
that ii. holds for $n$ and i. holds for $n + 1$. To do this we will prove the following facts in
sequence:

(a) (16) holds for $n$.

(b) $\Delta_{n+1}(x,0)$ is increasing in $x$.

(c) (17) holds for $n$.

(d) $\Delta_{n+1}(x, 1)$ is increasing in $x$.

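(a) Let $B_n^s = \sup\{ k \ge 1 : \Delta_n(k - 1, 0) \le c/\delta \}$. Note that $B_n^s = 1 + T_{\Delta_n(\cdot,0)}(c/\delta)$.

Since, by the induction hypothesis, $\Delta_n(x, 0)$ is increasing in $x$, it follows from (11) that

\[
a_{n+1}^s(x) =
\begin{cases}
l, & \text{if } x \le B_n^s,\\
h, & \text{if } x > B_n^s.
\end{cases} \tag{18}
\]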

(b) Under (a), (6) is transformed to

\[
v_{n+1}(x,0) =
\begin{cases}
-h(x) + \lambda v_n(x,1) + \mu_l v_n(x-1,0) + \delta v_n(x,0), & \text{if } x \le B_n^s,\\
-c - h(x) + \lambda v_n(x,1) + \mu_l v_n(x-1,0) + \delta v_n(x-1,0), & \text{if } x > B_n^s.
\end{cases} \tag{19}
\]

Thus $v_{n+1}(x,0)$ is concave in $x$ (i.e. $\Delta_{n+1}(x,0) \le \Delta_{n+1}(x + 1,0)$) for $x \le B_n^s - 2$ and for
$x \ge B_n^s + 1$, because of the convexity of the holding cost rate $h(x)$ and the induction hypothesis.
In order to complete the proof of (b) we must show that

\[
\Delta_{n+1}(B_n^s - 1, 0) \le \Delta_{n+1}(B_n^s, 0) \le \Delta_{n+1}(B_n^s + 1, 0). \tag{20}
\]

By (19), we obtain the following.

\begin{align}
\Delta_{n+1}(B_n^s - 1, 0) &= v_{n+1}(B_n^s - 1, 0) - v_{n+1}(B_n^s, 0) \notag\\
 &= -h(B_n^s - 1) + \lambda v_n(B_n^s - 1, 1) + \mu_l v_n(B_n^s - 2, 0) + \delta v_n(B_n^s - 1, 0) \notag\\
 &\quad + h(B_n^s) - \lambda v_n(B_n^s, 1) - \mu_l v_n(B_n^s - 1, 0) - \delta v_n(B_n^s, 0) \notag\\
 &= \Delta h(B_n^s - 1) + \lambda \Delta_n(B_n^s - 1, 1) + \mu_l \Delta_n(B_n^s - 2, 0) + \delta \Delta_n(B_n^s - 1, 0), \tag{21}\\[4pt]
\Delta_{n+1}(B_n^s, 0) &= v_{n+1}(B_n^s, 0) - v_{n+1}(B_n^s + 1, 0) \notag\\
 &= -h(B_n^s) + \lambda v_n(B_n^s, 1) + \mu_l v_n(B_n^s - 1, 0) + \delta v_n(B_n^s, 0) \notag\\
 &\quad + c + h(B_n^s + 1) - \lambda v_n(B_n^s + 1, 1) - \mu_l v_n(B_n^s, 0) - \delta v_n(B_n^s, 0) \notag\\
 &= c + \Delta h(B_n^s) + \lambda \Delta_n(B_n^s, 1) + \mu_l \Delta_n(B_n^s - 1, 0), \tag{22}
\end{align}

and

\begin{align}
\Delta_{n+1}(B_n^s + 1, 0) &= v_{n+1}(B_n^s + 1, 0) - v_{n+1}(B_n^s + 2, 0) \notag\\
 &= -c - h(B_n^s + 1) + \lambda v_n(B_n^s + 1, 1) + \mu_l v_n(B_n^s, 0) + \delta v_n(B_n^s, 0) \notag\\
 &\quad + c + h(B_n^s + 2) - \lambda v_n(B_n^s + 2, 1) - \mu_l v_n(B_n^s + 1, 0) - \delta v_n(B_n^s + 1, 0) \notag\\
 &= \Delta h(B_n^s + 1) + \lambda \Delta_n(B_n^s + 1, 1) + \mu_l \Delta_n(B_n^s, 0) + \delta \Delta_n(B_n^s, 0). \tag{23}
\end{align}

From the convexity of $h(x)$ and the induction hypothesis, we obtain:

\begin{align}
\Delta h(B_n^s - 1) &\le \Delta h(B_n^s) \le \Delta h(B_n^s + 1), \tag{24}\\
\Delta_n(B_n^s - 1, 1) &\le \Delta_n(B_n^s, 1) \le \Delta_n(B_n^s + 1, 1), \tag{25}
\end{align}

and

\[
\Delta_n(B_n^s - 2, 0) \le \Delta_n(B_n^s - 1, 0) \le \Delta_n(B_n^s, 0). \tag{26}
\]

By the definition of $B_n^s$, $\Delta_n(B_n^s - 1, 0) \le c/\delta$ and $\Delta_n(B_n^s, 0) > c/\delta$, thus

\[
\delta \Delta_n(B_n^s - 1, 0) \le c < \delta \Delta_n(B_n^s, 0). \tag{27}
\]

By (21) through (27), (20) is proved.

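(c) Let $B_{n+1}^d = \sup\{ k \ge 0 : \Delta_{n+1}(k,0) \le R \}$. Note that $B_{n+1}^d = T_{\Delta_{n+1}(\cdot,0)}(R)$.
Since, from (b), $\Delta_{n+1}(x,0)$ is increasing in $x$, it follows from (10) that

\[
a_{n+1}^d(x) =
\begin{cases}
1, & \text{if } x \le B_{n+1}^d,\\
0, & \text{if } x > B_{n+1}^d.
\end{cases} \tag{28}
\]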

(d) Under (c), optimality equation (5) is transformed to

\[
v_{n+1}(x,1) =
\begin{cases}
R + v_{n+1}(x + 1, 0), & \text{if } x \le B_{n+1}^d,\\
v_{n+1}(x, 0), & \text{if } x > B_{n+1}^d.
\end{cases} \tag{29}
\]

As in (b), by (29) we obtain that $v_{n+1}(x, 1)$ is concave in $x$ for $x \le B_{n+1}^d - 2$ and for $x \ge B_{n+1}^d + 1$,
because $v_{n+1}(x,0)$ is concave in $x$, as we have proved in (b).

In order to complete the proof we have to consider the cases $x = B_{n+1}^d - 1$ and $x = B_{n+1}^d$,
thus we need to show

\[
\Delta_{n+1}(B_{n+1}^d - 1, 1) \le \Delta_{n+1}(B_{n+1}^d, 1) \le \Delta_{n+1}(B_{n+1}^d + 1, 1). \tag{30}
\]

After some algebra we obtain that

\begin{align}
\Delta_{n+1}(B_{n+1}^d - 1, 1) &= v_{n+1}(B_{n+1}^d - 1, 1) - v_{n+1}(B_{n+1}^d, 1) \notag\\
 &= \Delta_{n+1}(B_{n+1}^d, 0), \tag{31}\\
\Delta_{n+1}(B_{n+1}^d, 1) &= v_{n+1}(B_{n+1}^d, 1) - v_{n+1}(B_{n+1}^d + 1, 1) \notag\\
 &= R, \tag{32}\\
\Delta_{n+1}(B_{n+1}^d + 1, 1) &= v_{n+1}(B_{n+1}^d + 1, 1) - v_{n+1}(B_{n+1}^d + 2, 1) \notag\\
 &= \Delta_{n+1}(B_{n+1}^d + 1, 0). \tag{33}
\end{align}

By the definition of $B_{n+1}^d$, it follows that

\[
\Delta_{n+1}(B_{n+1}^d, 0) \le R < \Delta_{n+1}(B_{n+1}^d + 1, 0). \tag{34}
\]

By (31)-(34), (30) holds and this completes the proof of the Theorem. ■

According to Theorem 2, the optimal action in state $(x,1)$ with $n$ remaining transitions is
twofold and prescribed by the pair $(a_n^d(x), a_n^s(x))$. It is optimal to accept the incoming customer
if $x \le B_n^d$ and reject him otherwise, whereas the optimal service rate for the time interval
until the next transition is determined immediately after the admission action is taken, and the
service rate is set to low if $x \le B_{n-1}^s$ and to high otherwise. The difference in the subscript
between the admission and service thresholds is due to the fact that for the admission decision
with $n$ remaining steps the relevant burden function is $\Delta_n(\cdot,0)$, while for the service
rate decision it is $\Delta_{n-1}(\cdot,0)$. On the other hand, in states $(x,0)$ the single optimal action is
determined solely by $a_n^s(x)$, the service rate to be employed until the next transition epoch.
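The subscript difference between the two thresholds can be made concrete with a small sketch that reads both thresholds off finite-horizon value functions. The helper names `burden`, `sup_index` and `thresholds` are hypothetical, and the inputs are assumed to be finite lists holding $v_n(x,0)$ and $v_{n-1}(x,0)$ on a truncated range of $x$, so the returned values are only meaningful when the thresholds fall inside that range.

```python
def burden(v):
    """Delta(x, 0) = v(x, 0) - v(x + 1, 0) for the listed states."""
    return [v[x] - v[x + 1] for x in range(len(v) - 1)]


def sup_index(values, theta):
    """sup{k >= 0 : values[k] <= theta}, with -1 for the empty set."""
    last = -1
    for k, val in enumerate(values):
        if val <= theta:
            last = k
    return last


def thresholds(v_n_0, v_prev_0, R, c, delta):
    """Return (B^d_n, B^s_{n-1}) for a decision with n transitions remaining.

    The admission threshold uses the burden Delta_n(., 0), while the service
    threshold uses Delta_{n-1}(., 0), mirroring the subscripts in the text.
    """
    admission_threshold = sup_index(burden(v_n_0), R)
    service_threshold = 1 + sup_index(burden(v_prev_0), c / delta)
    return admission_threshold, service_threshold
```

With the illustrative inputs `v_n_0 = [0, -2, -5, -9, -14]`, `v_prev_0 = [0, -1, -3, -6, -10]`, `R = 3.5`, `c = 1.0` and `delta = 0.4`, arrivals are admitted up to state 1 and the low rate is used up to state 2.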

In the remainder of the paper it will be useful to adopt an alternative viewpoint and consider 
the optimal policy of admission/service control as pairs of decisions both taken at departure 
epochs. Specifically let 

\[
a_{n+1}(x) = \bigl( a_{n+1}^s(x),\ a_n^d(x) \bigr).
\]

The pair $a_{n+1}(x)$ can be seen as a decision made at state $(x,0)$ with $n + 1$ remaining steps,
prescribing: (i) the service rate to be employed until the next transition epoch and (ii) whether
to admit a new customer in the event that the next transition is an arrival. Thus, one may
view the admission/rejection policy as a sign posted at the entrance of the system after every
departure event. The sign specifies whether new arrivals are welcome to enter the system or
not. Adopting this view, the pair $a_{n+1}(x)$ specifies which service rate will be employed when
$n + 1$ transitions remain, as well as which sign will be posted at that instant.

The following proposition shows that the service and admission thresholds are ordered in a
specific way according to the values of parameters $R$, $c$ and $\delta$. More specifically, we show that,
when $R \le c/\delta$, the availability of the high service rate is of limited value as a profit maximizing
option. Indeed, in this case, whenever the high service rate is employed in a state $x$, the rejection
sign is posted for arrivals at the next decision epoch.



Proposition 1 If $R \le c/\delta$ then $B_n^d + 1 \le B_n^s$ and the optimal threshold policy is given by

\[
a_{n+1}(x) =
\begin{cases}
(l, 1), & 0 \le x \le B_n^d,\\
(l, 0), & B_n^d < x \le B_n^s,\\
(h, 0), & x > B_n^s.
\end{cases}
\]


Proof. Suppose $R \le c/\delta$. From the definitions of $B_n^s$, $B_n^d$ and the monotonicity of $T_f(\theta)$ in $\theta$, it
follows that

\[
B_n^s = 1 + T_{\Delta_n(\cdot,0)}\!\left(\frac{c}{\delta}\right) \ge 1 + T_{\Delta_n(\cdot,0)}(R) = 1 + B_n^d.
\]

The possible cases for $a_{n+1}(x)$ follow immediately given the inequality $B_n^s \ge 1 + B_n^d$. ■
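The three regions of the policy in Proposition 1 can be written as a small lookup. This is an illustrative sketch only: the function name `joint_action` is hypothetical, and the thresholds are assumed to be given integers satisfying the ordering of the proposition.

```python
def joint_action(x, admission_threshold, service_threshold):
    """Return (service rate, admission sign) in state x under Proposition 1.

    admission_threshold plays the role of B^d_n and service_threshold that of
    B^s_n; the ordering B^d_n + 1 <= B^s_n is assumed and checked.
    """
    if admission_threshold + 1 > service_threshold:
        raise ValueError("Proposition 1 requires B^d_n + 1 <= B^s_n")
    if x <= admission_threshold:
        return ("l", 1)  # low rate, arrivals welcome
    if x <= service_threshold:
        return ("l", 0)  # low rate, arrivals rejected
    return ("h", 0)      # high rate, arrivals rejected
```

With thresholds 2 and 4, for example, states 0 to 2 map to `("l", 1)`, states 3 and 4 to `("l", 0)`, and every state above 4 to `("h", 0)`, so the high rate never coincides with an open admission sign.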

Note that if $R > c/\delta$, it can be shown similarly that $B_n^d + 1 \ge B_n^s$. However, we cannot
generally conclude that the service rate flexibility is employed beneficially in this case, because
the inequality does not preclude the possibility that $B_n^d + 1 = B_n^s$, which falls in the case of
Proposition 1. Therefore, $R > c/\delta$, i.e., that the relative
cost of the high service rate is sufficiently low compared to the service revenue, is a necessary but
not sufficient condition for the service rate switch option to be useful.

This leads to the question of the value of the high service rate option in general. In the 
following section, we explore this issue more thoroughly, by analyzing the value of service flexibility 
as a function of the system state. 

3 The Value of Service Flexibility 



Motivated by the discussion at the end of the previous section, we next explore how the service 
rate switch option affects, in terms of profit and admission thresholds, the system where only 
admission control is employed at the low rate. In terms of profit, we show that the option to 
increase service capacity becomes more profitable as the system congestion increases, whereas 
in terms of admission thresholds, service flexibility allows more customers to 
be accepted. 

To assess the value of service rate flexibility, we note that the restriction of the combined 
problem described in (5)-(8) to the class of policies where the service rate is always set to low 
mode is equivalent to a pure admission control subproblem (for a typical admission control 
formulation see Puterman (1994), pp. 568-571). This restriction corresponds to the following 
set of optimality equations in finite horizon: 

\begin{align}
\tilde v_{n+1}(x,1) &= \max\{R + \tilde v_{n+1}(x+1,0),\ \tilde v_{n+1}(x,0)\}, & x \ge 0 \tag{35} \\
\tilde v_{n+1}(x,0) &= -h(x) + \lambda \tilde v_n(x,1) + \mu_l \tilde v_n(x-1,0) + \delta \tilde v_n(x,0), & x > 0 \tag{36} \\
\tilde v_{n+1}(0,0) &= \lambda \tilde v_n(0,1) + \mu_h \tilde v_n(0,0) \tag{37} \\
\tilde v_0(x,i) &= 0, & x \ge 0,\ i \in \{0,1\}, \tag{38}
\end{align}

where $\tilde v_n(x,i)$ denotes the maximum discounted net profit for the remaining $n$ transitions, when 
the service rate is set to $\mu_l$ and admission is dynamically controlled. In the following we will 
refer to the restricted problem as the admission control subproblem. 
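
As a rough illustration, the recursion (35)-(38) can be iterated numerically. The Python sketch below is not the authors' implementation: the function name, the truncation of the state space at `x_max` with forced rejection at the boundary, and the rescaling of the rates by $\lambda + \mu_h + \beta$ in the usage example are all assumptions made here for computation.

```python
def admission_value_iteration(lam, mu_l, mu_h, R, h, n_steps, x_max):
    """Iterate (35)-(38) n_steps times on the truncated state space 0..x_max.

    Assumes normalized parameters and forces rejection at x_max (a truncation
    assumption, not part of the model). Returns (v0, v1), where v0[x] and v1[x]
    approximate the subproblem values at states (x, 0) and (x, 1).
    """
    delta = mu_h - mu_l
    v0 = [0.0] * (x_max + 1)                       # (38): zero terminal values
    v1 = [0.0] * (x_max + 1)
    for _ in range(n_steps):
        new0 = [0.0] * (x_max + 1)
        new0[0] = lam * v1[0] + mu_h * v0[0]       # (37)
        for x in range(1, x_max + 1):              # (36)
            new0[x] = -h(x) + lam * v1[x] + mu_l * v0[x - 1] + delta * v0[x]
        new1 = [0.0] * (x_max + 1)
        for x in range(x_max):                     # (35)
            new1[x] = max(R + new0[x + 1], new0[x])
        new1[x_max] = new0[x_max]                  # truncation: reject at the boundary
        v0, v1 = new0, new1
    return v0, v1


if __name__ == "__main__":
    scale = 5 + 5 + 0.5    # assumed rescaling so that the rates and beta sum to 1
    v0, v1 = admission_value_iteration(5 / scale, 3 / scale, 5 / scale, R=2.0,
                                       h=lambda x: x * x / scale,
                                       n_steps=50, x_max=30)
    print(v0[:5])
```

The holding cost passed in is a cost per uniformized step, which is why the usage example divides $x^2$ by the same scale as the rates.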

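Let $\tilde a^d_n(x,1)$ be the optimal decision in state $(x,1)$. From (35) to (37) it follows that 

\[ \tilde a^d_{n+1}(x,1) = \begin{cases} 1, & \tilde\Delta_{n+1}(x,0) \le R \\ 0, & \text{otherwise} \end{cases} \tag{39} \]

for $x \ge 0$, where 

\[ \tilde\Delta_n(x,i) = \tilde v_n(x,i) - \tilde v_n(x+1,i) \]

denotes the burden in terms of expected profit reduction that an additional customer brings to 
the defined system. 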

Similarly to Lemma 1 and Theorem 2 it can be shown that $\tilde v_n(x,i)$ is nonincreasing and 
concave in $x$, or equivalently that $\tilde\Delta_n(x,i)$ is nonnegative and increasing in $x$. 
Therefore, the optimal admission rule is characterized by admission thresholds 

\[ \tilde B^d_n = T_{\tilde\Delta_n(\cdot,0)}(R) \]

so that $\tilde a^d_{n+1}(x,1) = 1$ if and only if $x \le \tilde B^d_{n+1}$, $x \ge 0$, $n = 0, 1, \ldots$. The threshold structure 
of the optimal policy is not new (see e.g. Walrand (1988), p. 278, Puterman (1994), p. 568). We 
restate it here in a notation that allows comparison with the combined problem. 

We can now define the value of service flexibility as the benefit that the system administrator 
earns from using the service rate switch option, i.e., $e_n(x,i) = v_n(x,i) - \tilde v_n(x,i)$. It is immediate 
that $e_n(x,i) \ge 0$, for all $x, i, n$. 
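
The value of flexibility can be explored numerically by running the two finite-horizon recursions side by side. The following Python sketch is a hedged illustration: the combined-problem recursion is assumed to differ from (36) only by replacing the term multiplied by $\delta$ with a maximum over the two service rates, and the function names, the truncation at `x_max` and the parameter rescaling are assumptions of this sketch.

```python
def value_functions(lam, mu_l, mu_h, R, c, h, n_steps, x_max, flexible):
    """Finite-horizon values on 0..x_max; flexible=False reproduces (35)-(38).

    With flexible=True the delta-term becomes
    max{delta*v(x,0), -c + delta*v(x-1,0)}, the form assumed here for the
    combined problem. Parameters are assumed normalized, and rejection is
    forced at x_max as a computational truncation.
    """
    delta = mu_h - mu_l
    v0 = [0.0] * (x_max + 1)
    v1 = [0.0] * (x_max + 1)
    for _ in range(n_steps):
        new0 = [lam * v1[0] + mu_h * v0[0]] + [0.0] * x_max
        for x in range(1, x_max + 1):
            stay_low = delta * v0[x]
            if flexible:
                service = max(stay_low, -c + delta * v0[x - 1])
            else:
                service = stay_low
            new0[x] = -h(x) + lam * v1[x] + mu_l * v0[x - 1] + service
        new1 = [max(R + new0[x + 1], new0[x]) for x in range(x_max)]
        new1.append(new0[x_max])        # truncation: reject at the boundary
        v0, v1 = new0, new1
    return v0, v1


def flexibility_value(lam, mu_l, mu_h, R, c, h, n_steps, x_max):
    """Return the list of differences (combined value - admission-only value) at (x, 0)."""
    args = (lam, mu_l, mu_h, R, c, h, n_steps, x_max)
    v_flex, _ = value_functions(*args, flexible=True)
    v_low, _ = value_functions(*args, flexible=False)
    return [a - b for a, b in zip(v_flex, v_low)]


if __name__ == "__main__":
    scale = 10.5    # assumed rescaling constant for lambda=5, mu_h=5, beta=0.5
    e = flexibility_value(5 / scale, 3 / scale, 5 / scale, R=4.0, c=6 / scale,
                          h=lambda x: x * x / scale, n_steps=30, x_max=25)
    print([round(val, 4) for val in e[:10]])
```

Setting the cost `c` very large makes the high rate unattractive in every state, so the two recursions coincide and the computed differences vanish.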

In the next theorem, we first prove that the value of service flexibility is nondecreasing in the 
system length $x$, and thus the option to switch to a higher service rate is more useful as the queue 
becomes longer, which is intuitively expected. Moreover, this is equivalent to the fact that the 
burden that an additional customer imposes on the system is lower when the high service rate 
option is available, compared to the pure admission control subproblem. This is also intuitive, 
since by increasing the service rate, it is possible to alleviate the extra delay caused by the 
additional customer. 

Secondly, we show that the admission thresholds are increased when the service rate switch is 
available; thus a customer who would not be accepted in the restricted system may be accepted 
when the system manager has the flexibility to switch to a higher service rate. 

Theorem 3 i. $e_n(x,i)$ is nondecreasing in $x$ for all $n, i$. 

ii. $\tilde B^d_n \le B^d_n$ for all $n$. 

Proof. Note that i. is equivalent to $\Delta_n(x,i) \le \tilde\Delta_n(x,i)$ for any $i \in \{0,1\}$ and $n = 0, 1, \ldots$. The 
proof is by induction on $n$. For $n = 0$, i. is immediate since $e_0(x,i) = 0$, by initial conditions 
(8) and (38). 

Now suppose that i. holds for some $n$. We consider the following cases: 
Case 1: $i = 0$. For $x > 0$, by equations (6) and (36) we obtain 

\begin{align*}
e_{n+1}(x,0) &= v_{n+1}(x,0) - \tilde v_{n+1}(x,0) \\
 &= \lambda e_n(x,1) + \mu_l e_n(x-1,0) + \delta e_n(x,0) + (\delta \Delta_n(x-1,0) - c)^+ .
\end{align*}

From the induction hypothesis and the fact that $\Delta_n(x-1,0)$ is increasing in $x$, it follows that 
$e_{n+1}(x,0)$ is nondecreasing in $x$ for $x > 0$. Finally, for $x = 0$, 

\begin{align*}
e_{n+1}(0,0) &= v_{n+1}(0,0) - \tilde v_{n+1}(0,0) = \lambda e_n(0,1) + \mu_h e_n(0,0), \\
e_{n+1}(1,0) &= v_{n+1}(1,0) - \tilde v_{n+1}(1,0) \\
 &= \lambda e_n(1,1) + \mu_l e_n(0,0) + \delta e_n(1,0) + (\delta \Delta_n(0,0) - c)^+ .
\end{align*}

Since $\mu_h = \mu_l + \delta$, it follows from the above equations that $e_{n+1}(1,0) - e_{n+1}(0,0) \ge 0$, by the induction hypothesis. 

Therefore, $e_{n+1}(x,0)$ is nondecreasing in $x$, for $x \ge 0$, thus $\Delta_{n+1}(x,0) \le \tilde\Delta_{n+1}(x,0)$. By 
property (b) of the generic threshold-type function $T_f(\theta)$, we obtain that 

\[ T_{\Delta_{n+1}(\cdot,0)}(R) \ge T_{\tilde\Delta_{n+1}(\cdot,0)}(R), \quad \text{i.e.,} \quad B^d_{n+1} \ge \tilde B^d_{n+1}, \]

thus ii. holds for $n+1$. 
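
For reference, the positive-part term in the Case 1 expression for $e_{n+1}(x,0)$ can be traced to the service rate decision. The short derivation below is a sketch under the assumption that equation (6) of the combined problem has the same form as (36), with the term multiplied by $\delta$ replaced by a maximum over the two service rates; equation (6) itself is not restated here.

```latex
\begin{align*}
\max\{\delta v_n(x,0),\ -c + \delta v_n(x-1,0)\}
  &= \delta v_n(x,0) + \max\{0,\ \delta\,[v_n(x-1,0) - v_n(x,0)] - c\} \\
  &= \delta v_n(x,0) + (\delta \Delta_n(x-1,0) - c)^+ .
\end{align*}
```

Subtracting (36) from this form of (6) leaves the three linear terms in $e_n$ plus the positive part, which is nonnegative and, because $\Delta_n(x-1,0)$ is increasing in $x$, nondecreasing in $x$.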

Case 2: $i = 1$. From the optimality equations (5), (35) and the property $\tilde B^d_{n+1} \le B^d_{n+1}$, 
$e_{n+1}(x,1)$ can be written as 

\[ e_{n+1}(x,1) = v_{n+1}(x,1) - \tilde v_{n+1}(x,1) = \begin{cases} e_{n+1}(x+1,0), & x \le \tilde B^d_{n+1} \\ R + v_{n+1}(x+1,0) - \tilde v_{n+1}(x,0), & \tilde B^d_{n+1} < x \le B^d_{n+1} \\ e_{n+1}(x,0), & x \ge B^d_{n+1} + 1 \end{cases} \tag{40} \]

To show monotonicity we consider further subcases for $x$. 

Case 2a: $x \le \tilde B^d_{n+1}$. For $x \le \tilde B^d_{n+1} - 1$, by (40) we obtain that $e_{n+1}(x,1) = e_{n+1}(x+1,0)$, 
which is nondecreasing in $x$, from Case 1. 

For $x = \tilde B^d_{n+1}$ we have to prove that $e_{n+1}(\tilde B^d_{n+1},1) \le e_{n+1}(\tilde B^d_{n+1}+1,1)$. 
Indeed, 

\begin{align*}
e_{n+1}(\tilde B^d_{n+1},1) - e_{n+1}(\tilde B^d_{n+1}+1,1) &= v_{n+1}(\tilde B^d_{n+1}+1,0) - \tilde v_{n+1}(\tilde B^d_{n+1}+1,0) \\
 &\quad - R - v_{n+1}(\tilde B^d_{n+1}+2,0) + \tilde v_{n+1}(\tilde B^d_{n+1}+1,0) \\
 &= \Delta_{n+1}(\tilde B^d_{n+1}+1,0) - R \le 0,
\end{align*}

by definition of $B^d_{n+1}$. 
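Case 2b: $\tilde B^d_{n+1} < x \le B^d_{n+1}$. 

For $\tilde B^d_{n+1} + 1 \le x \le B^d_{n+1} - 1$, it follows from (40) that 

\begin{align*}
e_{n+1}(x,1) - e_{n+1}(x+1,1) &= R + v_{n+1}(x+1,0) - \tilde v_{n+1}(x,0) \\
 &\quad - R - v_{n+1}(x+2,0) + \tilde v_{n+1}(x+1,0) \\
 &= \Delta_{n+1}(x+1,0) - \tilde\Delta_{n+1}(x,0) \le 0,
\end{align*}

since $\Delta_{n+1}(x+1,0) \le R$ for $x + 1 \le B^d_{n+1}$ and $\tilde\Delta_{n+1}(x,0) > R$ for $x > \tilde B^d_{n+1}$. 

For $x = B^d_{n+1}$ we must show that $e_{n+1}(B^d_{n+1},1) \le e_{n+1}(B^d_{n+1}+1,1)$. 
Again from (40) we obtain 

\begin{align*}
e_{n+1}(B^d_{n+1},1) - e_{n+1}(B^d_{n+1}+1,1) &= R + v_{n+1}(B^d_{n+1}+1,0) - \tilde v_{n+1}(B^d_{n+1},0) \\
 &\quad - v_{n+1}(B^d_{n+1}+1,0) + \tilde v_{n+1}(B^d_{n+1}+1,0) \\
 &= R - \tilde\Delta_{n+1}(B^d_{n+1},0) \le 0,
\end{align*}

by definition of $\tilde B^d_{n+1}$. 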
Case 2c: $x \ge B^d_{n+1} + 1$. 

For $x \ge B^d_{n+1} + 1$ the monotonicity of $e_{n+1}(x,1)$ is immediate by (40) and Case 1. 

Thus, we have shown the monotonicity of $e_{n+1}(x,1)$ in $x$ for the case $\tilde B^d_{n+1} < B^d_{n+1}$. It 
remains to examine the case $\tilde B^d_{n+1} = B^d_{n+1}$. Then the middle range in (40) disappears and it is 
left to show that, for $x = B^d_{n+1}$: $e_{n+1}(B^d_{n+1}+1,1) \ge e_{n+1}(B^d_{n+1},1)$. 

Once again by (40) we obtain 

\[ e_{n+1}(B^d_{n+1}+1,1) - e_{n+1}(B^d_{n+1},1) = e_{n+1}(B^d_{n+1}+1,0) - e_{n+1}(B^d_{n+1}+1,0) = 0. \]

Therefore $e_{n+1}(x,1)$ is nondecreasing in $x$. ■ 

Now that the properties of $e_n(x,i)$ have been shown for the finite horizon version of the 
problem, it is natural to ask how the results of Proposition 1 are related to Theorem 3. In 
particular, one might conjecture that if $R < c/\delta$, then $e_n(x,0) = 0$ and $\tilde B^d_n = B^d_n$. However this 
may not be generally true for the following reason. If $R < c/\delta$, then the high service rate is not 
used in states where customers are admitted. However it may still be used in states with large 
$x$ although new arrivals are rejected, in order to empty the queue faster and reduce the holding 
costs. Therefore $e_n(x,0)$ could still be positive in such states. Furthermore, even for states 
with $x$ small enough so that the optimal service rate is low in the combined problem, i.e., for 
$x \le B^s_n$, it is not clear that $e_n(x,0) = 0$. If the thresholds could be shown to be monotone with 
respect to the number of periods $n$, then the above could be shown by induction; however, this 
monotonicity may not be true in general. 

On the other hand, it has been shown in Theorem 1 that, under fairly general conditions on 
the holding cost function $h(x)$, the finite horizon problems converge as $n \to \infty$ to the infinite 
horizon problem, for which the optimal policy is stationary. For this limiting problem it is 
possible to prove an interesting relationship between Proposition 1 and Theorem 3, as we do 
next. 

3.1 Service Rate Flexibility Under Infinite Horizon 

In this subsection we show that, under the sufficient condition $R < c/\delta$ stated in Proposition 
1, service rate flexibility is essentially of no value in low congestion states and the 
optimal admission thresholds are the same with and without the service rate switch option, in 
the framework of the infinite horizon discounted problem. 

Consider the infinite horizon problem and assume that the holding cost function satisfies the 
conditions of Theorem 1. It follows that the value functions of the combined and the admission 
control subproblems converge to their infinite horizon counterparts $v(\cdot,\cdot)$, $\tilde v(\cdot,\cdot)$, which retain 
the monotonicity and concavity properties proved for finite $n$. Thus, the infinite horizon optimal 
policies are still threshold-based with time stationary thresholds, i.e., there exist $B^s$, $B^d$, $\tilde B^d$ such 
that the optimal policy $a = a(B^s, B^d)$ for the combined problem is 

\[ a^s(x,i) = \begin{cases} l, & x \le B^s \\ h, & x > B^s \end{cases} \qquad a^d(x,1) = \begin{cases} 1, & x \le B^d \\ 0, & x > B^d \end{cases} \]

and the optimal policy $\tilde a = \tilde a(\tilde B^d)$ for the admission control subproblem is 

\[ \tilde a^d(x,1) = \begin{cases} 1, & x \le \tilde B^d \\ 0, & x > \tilde B^d \end{cases} \]

Furthermore, the value of service rate flexibility $e(x,i) = v(x,i) - \tilde v(x,i)$ is nondecreasing in $x$ and $\tilde B^d \le B^d$. 

We thus restrict attention to the class of stationary threshold-type policies. Let $\pi = \pi(b^s, b^d)$ 
be any (not necessarily optimal) stationary threshold policy for the combined problem, prescribing 
actions $\pi^s(x,i) = l$ if and only if $x \le b^s$, and $\pi^d(x,i) = 1$ if and only if $x \le b^d$. A threshold 
policy $\tilde\pi = \tilde\pi(b^d)$ for the admission control subproblem can be defined similarly. Finally let $v_\pi$ 
denote the infinite horizon discounted profit function for the combined problem under threshold 
policy $\pi$, and $\tilde v_{\tilde\pi}$ the corresponding function for the admission control subproblem under 
threshold policy $\tilde\pi$. 

Now consider a threshold policy $\pi(b^s, b^d)$ with thresholds $b^s \ge b^d + 1$ and the corresponding 
policy $\tilde\pi(b^d)$ for the admission control subproblem with the same admission threshold. The 
following lemma shows that under these two policies the value functions of the two control 
problems coincide for all reachable states $x$ where the low service rate is used. 

Lemma 2 For any $b^s$, $b^d$ such that $b^s \ge b^d + 1$, and policies $\pi = \pi(b^s, b^d)$, $\tilde\pi = \tilde\pi(b^d)$, 

\[ v_\pi(x,i) = \tilde v_{\tilde\pi}(x,i), \quad x = 0, 1, \ldots, b^s,\ i = 0, 1. \]

Proof. For the infinite horizon discounted profit maximization problem the value function 
corresponding to stationary policy $\pi$ can be found as the unique solution to a system of linear 
equations corresponding to the policy evaluation step of the policy iteration method. Specifically, 
for policy $\pi(b^s, b^d)$ with $b^s \ge b^d + 1$, the policy evaluation equations are 

\begin{align*}
v_\pi(0,0) &= \lambda v_\pi(0,1) + \mu_h v_\pi(0,0) \\
v_\pi(x,0) &= -h(x) + \lambda v_\pi(x,1) + \mu_l v_\pi(x-1,0) + \delta v_\pi(x,0), & x &= 1, \ldots, b^s \\
v_\pi(x,1) &= R + v_\pi(x+1,0), & x &= 0, \ldots, b^d \\
v_\pi(x,1) &= v_\pi(x,0), & x &= b^d+1, \ldots, b^s \\
v_\pi(x,0) &= -h(x) - c + \lambda v_\pi(x,1) + \mu_h v_\pi(x-1,0), & x &> b^s \\
v_\pi(x,1) &= v_\pi(x,0), & x &> b^s
\end{align*}

In the above system, the $2(b^s+1)$ values $v_\pi(x,i)$, $x = 0, \ldots, b^s$, $i = 0, 1$ are the unique solution 
to the first $2(b^s+1)$ equations for $i = 0, 1$ and $x = 0, \ldots, b^s$. It is also easy to see that each of the 
remaining quantities $v_\pi(x,i)$ for $i = 0, 1$ and $x > b^s$ can be obtained recursively as a function of 
$v_\pi(b^s,0)$, from the remaining equations. 

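Similarly, for policy $\tilde\pi(b^d)$ the policy evaluation equations of the admission control subproblem 
are 

\begin{align*}
\tilde v_{\tilde\pi}(0,0) &= \lambda \tilde v_{\tilde\pi}(0,1) + \mu_h \tilde v_{\tilde\pi}(0,0) \\
\tilde v_{\tilde\pi}(x,0) &= -h(x) + \lambda \tilde v_{\tilde\pi}(x,1) + \mu_l \tilde v_{\tilde\pi}(x-1,0) + \delta \tilde v_{\tilde\pi}(x,0), & x &= 1, \ldots, b^s \\
\tilde v_{\tilde\pi}(x,1) &= R + \tilde v_{\tilde\pi}(x+1,0), & x &= 0, \ldots, b^d \\
\tilde v_{\tilde\pi}(x,1) &= \tilde v_{\tilde\pi}(x,0), & x &= b^d+1, \ldots, b^s \\
\tilde v_{\tilde\pi}(x,0) &= -h(x) + \lambda \tilde v_{\tilde\pi}(x,1) + \mu_l \tilde v_{\tilde\pi}(x-1,0) + \delta \tilde v_{\tilde\pi}(x,0), & x &> b^s \\
\tilde v_{\tilde\pi}(x,1) &= \tilde v_{\tilde\pi}(x,0), & x &> b^s
\end{align*}

and the $2(b^s+1)$ values $\tilde v_{\tilde\pi}(x,i)$, $x = 0, \ldots, b^s$, $i = 0, 1$ are the unique solution to the first $2(b^s+1)$ 
equations for $i = 0, 1$ and $x = 0, \ldots, b^s$. 

We finally note that the first $2(b^s+1)$ equations are identical in the two problems above, 
therefore $v_\pi(x,i) = \tilde v_{\tilde\pi}(x,i)$, for $x = 0, 1, \ldots, b^s$, $i = 0, 1$. ■ 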
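
The statement of Lemma 2 can be checked numerically by evaluating the two threshold policies. The Python sketch below replaces the linear solve by successive approximation (a swapped-in technique, chosen to stay within the standard library); the function name, the truncation at `x_max` and the normalized parameter values in the usage example are assumptions of this sketch.

```python
def evaluate_threshold_policy(lam, mu_l, mu_h, R, c, h, b_d, b_s, x_max, sweeps=2000):
    """Successive approximation of the policy evaluation equations on 0..x_max.

    Low rate for x <= b_s, high rate (extra cost c) for x > b_s; admit iff x <= b_d.
    Passing b_s >= x_max never uses the high rate, i.e. the admission subproblem.
    Parameters are assumed normalized so that the iteration is a contraction,
    and b_d < x_max is required so admission stays inside the truncated range.
    """
    if not b_d < x_max:
        raise ValueError("need b_d < x_max")
    delta = mu_h - mu_l
    v0 = [0.0] * (x_max + 1)
    for _ in range(sweeps):
        v1 = [R + v0[x + 1] if x <= b_d else v0[x] for x in range(x_max + 1)]
        new0 = [lam * v1[0] + mu_h * v0[0]]
        for x in range(1, x_max + 1):
            if x <= b_s:
                new0.append(-h(x) + lam * v1[x] + mu_l * v0[x - 1] + delta * v0[x])
            else:
                new0.append(-h(x) - c + lam * v1[x] + mu_h * v0[x - 1])
        v0 = new0
    v1 = [R + v0[x + 1] if x <= b_d else v0[x] for x in range(x_max + 1)]
    return v0, v1


if __name__ == "__main__":
    scale = 10.5    # assumed rescaling constant
    args = (5 / scale, 3 / scale, 5 / scale, 2.0, 6 / scale, lambda x: x * x / scale)
    v0, _ = evaluate_threshold_policy(*args, b_d=3, b_s=5, x_max=20)
    w0, _ = evaluate_threshold_policy(*args, b_d=3, b_s=20, x_max=20)
    print(max(abs(v0[x] - w0[x]) for x in range(6)))
```

With $b^s \ge b^d + 1$ the states $0, \ldots, b^s$ form a closed set under both policies, so the two iterations agree on that set at every sweep, while the values for $x > b^s$ are free to differ.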

In the next proposition we make use of Lemma 2 to show that if the optimal policy $a(B^s, B^d)$ 
for the combined problem is such that $B^s \ge B^d + 1$, then the value of service rate flexibility is equal to 
zero for states with $x \le B^s$. Furthermore, the optimal admission threshold for the admission 
control subproblem is equal to that for the combined problem. 

Proposition 2 If $B^s \ge B^d + 1$, then $\tilde B^d = B^d$, and $e(x,i) = 0$, $i = 0, 1$, $x = 0, 1, \ldots, B^s$. 

Proof. We have shown that the optimal admission thresholds generally satisfy $\tilde B^d \le B^d$. 
Assume that the optimal policy for the combined problem $a(B^s, B^d)$ satisfies $B^s \ge B^d + 1$. 

Consider the policy $\tilde\pi = \tilde\pi(B^d)$ for the admission control subproblem that applies admission 
threshold $B^d$. From Lemma 2 it follows that $v_a(x,i) = \tilde v_{\tilde\pi}(x,i)$, for $x = 0, \ldots, B^s$, $i = 0, 1$. 
However, $v(x,i) = v_a(x,i)$ and $\tilde v_{\tilde\pi}(x,i) \le \tilde v(x,i) \le v(x,i)$ for all $(x,i)$. It follows that 
$\tilde v(x,i) = v(x,i)$, thus $e(x,i) = 0$, for $x = 0, \ldots, B^s$, $i = 0, 1$. 

Now suppose that the optimal policy $\tilde a(\tilde B^d)$ for the admission control subproblem is such 
that $\tilde B^d < B^d$ and consider state $(\tilde B^d + 1, 1)$. By the definition of $\tilde B^d$ it follows that admitting 
a customer in this state is strictly suboptimal for the admission control subproblem, i.e., 

\[ R + \tilde v(\tilde B^d + 2, 0) < \tilde v(\tilde B^d + 1, 0). \]

On the other hand, for the combined problem admitting the customer in this state is optimal, 
i.e., 

\[ R + v(\tilde B^d + 2, 0) \ge v(\tilde B^d + 1, 0). \]

However, $\tilde B^d + 2 \le B^d + 1 \le B^s$, thus, as we have shown above, $v(\tilde B^d + 2, 0) = \tilde v(\tilde B^d + 2, 0)$ and 
$v(\tilde B^d + 1, 0) = \tilde v(\tilde B^d + 1, 0)$. Therefore the two inequalities above lead to a contradiction and 
we conclude that $\tilde B^d = B^d$. ■ 

We can now show that, for the infinite horizon case, Proposition 1 complements Theorem 3 
in the sense that $R < c/\delta$ actually implies that adding the service rate switch possibility does not 
affect the admission threshold, and the value of flexibility is equal to zero for states with low 
congestion. This result is an immediate consequence of Propositions 1 and 2. 

Theorem 4 If $R < c/\delta$, then $\tilde B^d = B^d$, and $e(x,i) = 0$, $i = 0, 1$, $x = 0, 1, \ldots, B^s$. 

4 The Average Reward Case 



In this section, we consider the objective of expected average reward per unit time, and provide a 
sufficient condition under which a long-run average reward optimal policy exists and is obtained 
as a limit of the discounted reward problems as the discount rate $\beta \downarrow 0$. This implies 
that the results on the structure of the optimal policy and the value of service rate flexibility 
presented in the previous sections carry over to the average reward case. In this section we do 
not make the assumption that $\Lambda + \beta = 1$, since we consider sequences of values of $\beta$, keeping 
the remaining parameters fixed. 

For each policy $\pi \in \Pi$, the long-run average expected net profit given that the initial state 
is $(x,i) \in S$ is 

\[ g_\pi(x,i) = \limsup_{t \to \infty} \frac{1}{t} v^t_\pi(x,i), \quad (x,i) \in S, \tag{41} \]

where 

\[ v^t_\pi(x,i) = E_\pi\left[ \sum_{j=0}^{N_t} R\,\mathbf{1}(A^d(T_j) = 1) - \int_0^t \left[ h(X(u)) + c(A^s(u)) \right] du \,\Big|\, X(0) = x,\ I(0) = i \right] \]

denotes the expected net profit generated by the process $(X(t), I(t))$ up to time $t$ with initial 
state $(x,i) \in S$ and $N_t$ the number of admission decisions made up to time $t$. The optimal 
average expected net profit is defined as 

\[ g^*(x,i) = \sup_{\pi \in \Pi} g_\pi(x,i), \quad (x,i) \in S. \tag{42} \]

A policy $\pi^*$ is characterized as average-reward optimal if $g_{\pi^*}(x,i) = g^*(x,i)$ for all $(x,i) \in S$. 

As in Section 2.1, we transform the problem into an equivalent model in discrete time using 
uniformization, where the expected time between decision epochs is equal to $1/\Lambda$. The resulting 
discrete time Markov decision process is described by the following average reward optimality 
inequalities 

\begin{align}
w(x,1) &\le \max\{R + w(x+1,0),\ w(x,0)\}, \quad x \ge 0 \tag{43} \\
w(x,0) &\le -\frac{h(x)}{\Lambda} - \frac{g}{\Lambda} + \frac{\lambda}{\Lambda} w(x,1) + \frac{\mu_l}{\Lambda} w(x-1,0) \notag \\
 &\quad + \max\left\{ \frac{\delta}{\Lambda} w(x,0),\ -\frac{c}{\Lambda} + \frac{\delta}{\Lambda} w(x-1,0) \right\}, \quad x > 0 \tag{44} \\
w(0,0) &\le -\frac{g}{\Lambda} + \frac{\lambda}{\Lambda} w(0,1) + \frac{\mu_h}{\Lambda} w(0,0) \tag{45}
\end{align}

with respect to a constant $g$ and a real-valued function $w$ on $S$. These correspond to the discrete-time 
discounted optimality equations (5)-(8). Note that as the continuous discount rate $\beta \downarrow 0$, 
the equivalent discrete-time discount factor increases to 1. 



Theorem 7.2.3 of Sennott (1998) provides a set of sufficient conditions (SEN assumptions, 
p. 135) so that: (a) a solution $(w, g)$ to (43)-(45) exists, (b) a long run average reward optimal 
policy exists, which realizes the maximum in (43)-(45) and is obtained as the limit of a sequence 
of discounted optimal policies under a subsequence of discount rates $\beta_n \downarrow 0$, (c) $g$ is equal to 
the optimal average net profit and is obtained as a limit of the optimal discounted expected net 
profit for $\beta \downarrow 0$, and (d) $w(x,i)$ is a limit function of the sequence $w_{\beta_n}(x,i) = v_{\beta_n}(x,i) - v_{\beta_n}(0,0)$ 
with $\beta_n \downarrow 0$. 

In the following theorem we prove that Assumption 1, which was shown in Theorem 1 to be 
sufficient for the existence of a solution to the discounted problem, also ensures that the average 
reward problem has an optimal solution. 

Theorem 5 If the holding cost rate function satisfies Assumption 1, then 

(i) There exists a stationary long-run average reward optimal policy $\pi^*$, which is a limit point 
of a sequence of stationary discounted expected net profit optimal policies, i.e., 

\[ \pi^* = \lim_{n \to \infty} \pi_{\beta_n}, \]

where $\{\beta_n, n \ge 1\}$ is any sequence of discount rates such that $\beta_n \downarrow 0$ and $\pi_{\beta_n}$ is a 
$\beta_n$-discount optimal stationary policy. 

(ii) The expected average net profit associated with $\pi^*$ is equal to 

\[ g^* = \lim_{\beta \downarrow 0} \beta v_\beta(x,i), \tag{46} \]

for every $(x,i) \in S$. 

(iii) For any sequence $\beta_n \downarrow 0$ in (i), the sequence of functions $\{w_{\beta_n}, n \ge 1\}$ defined by 

\[ w_{\beta_n}(x,i) = v_{\beta_n}(x,i) - v_{\beta_n}(0,0) \]

converges pointwise to a function $w$ such that $(w, g^*)$ satisfy (43)-(45). 



Proof. From Theorem 7.2.3 of Sennott (1998) it is sufficient to verify the SEN assumptions. 
In our problem the state space $S$ is countable and from Theorem 1 and Lemma 1 the value 
function $v(x,i)$ of the discounted problem is nonincreasing in $x \in \mathbb{N}_0$ for any $i \in \{0,1\}$. Thus, 
we can apply Corollary 7.5.4 of Sennott (1998), which states that a sufficient condition for 
the SEN assumptions to hold in this case is the existence of a 0-standard policy, i.e. a (generally 
randomized) policy $d$ which induces an irreducible and positive recurrent Markov process with 
finite expected first passage time from any state $s$ to state 0, $m_{s0} < \infty$, $s \in S$, and expected first 
passage profit from any state $s$ to state 0, $w_{s0} > -\infty$, $s \in S$. 

To show the existence of a 0-standard policy, for any $p \in [0,1]$ let $d(p)$ be the randomized 
policy under which the service rate is always set to $\mu_l$ and arriving customers are admitted 
with probability $p$. If $p$ is such that $\lambda p / \mu_l < 1$, then under policy $d(p)$ the system is equivalent 
to a stable M/M/1 queue $\{X_t, t \ge 0\}$, with state $X_t$ denoting the number of customers in the 
system, arrival rate $\lambda p$, service rate $\mu_l$, admission reward $R$ and cost rate $h(x)$ while in state $x$. 

The discrete-time equivalent of this process corresponds to a positive recurrent Markov chain 
$\{X_n, n = 0, 1, \ldots\}$, with transition probabilities 

\begin{align*}
P_{x,x+1} &= \frac{\lambda p}{\Lambda}, \quad P_{x,x-1} = \frac{\mu_l}{\Lambda}, \quad P_{x,x} = 1 - \frac{\lambda p + \mu_l}{\Lambda}, \quad x > 0, \\
P_{01} &= 1 - P_{00} = \frac{\lambda p}{\Lambda},
\end{align*}

and one-step rewards $r(x) = \frac{\lambda p R - h(x)}{\Lambda}$. 

Let $N = \min\{n > 0 : X_n = 0\}$ denote the first passage time to state 0. It is well known 
that the expected first passage times to state 0 in the continuous-time M/M/1 queue are equal 
to $\frac{x}{\mu_l - \lambda p}$, $x > 0$, thus in the discrete time model 

\[ m_{x0} = E(N \mid X_0 = x) = \frac{\Lambda x}{\mu_l - \lambda p} < \infty . \]
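
The first passage formula can be checked against a one-step conditioning argument on the uniformized chain. The Python sketch below is illustrative only: the function names are hypothetical, and the numerical value of the uniformization rate used in the example ($\Lambda = \lambda + \mu_h = 10$) is an assumption, since $\Lambda$ is defined outside this section.

```python
def transition_probabilities(x, lam, mu_l, p, Lam):
    """One-step probabilities of the uniformized chain under d(p), as {next_state: prob}.

    Lam is the uniformization rate and must satisfy Lam >= lam * p + mu_l.
    """
    up = lam * p / Lam
    if x == 0:
        return {1: up, 0: 1.0 - up}
    down = mu_l / Lam
    return {x + 1: up, x - 1: down, x: 1.0 - up - down}


def expected_first_passage_steps(x, lam, mu_l, p, Lam):
    """Expected number of steps to reach state 0 from x: Lam * x / (mu_l - lam * p)."""
    if lam * p >= mu_l:
        raise ValueError("d(p) is not stable: need lam * p < mu_l")
    return Lam * x / (mu_l - lam * p)


if __name__ == "__main__":
    lam, mu_l, p, Lam = 5.0, 3.0, 0.5, 10.0   # Lam = lam + mu_h is assumed here
    for x in (1, 2, 5):
        row = transition_probabilities(x, lam, mu_l, p, Lam)
        one_step = 1.0 + sum(q * expected_first_passage_steps(j, lam, mu_l, p, Lam)
                             for j, q in row.items())
        print(x, expected_first_passage_steps(x, lam, mu_l, p, Lam), one_step)
```

For every $x > 0$ the closed form satisfies $m_{x0} = 1 + \sum_j P_{xj} m_{j0}$ with $m_{00}$ taken as zero on arrival at state 0, which is what the loop prints side by side.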

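Now consider the first passage expected profit 

\[ w_{x0} = E\left( \sum_{n=0}^{N-1} r(X_n) \,\Big|\, X_0 = x \right) = \frac{\lambda p R}{\Lambda}\, m_{x0} - \frac{1}{\Lambda} E\left( \sum_{n=0}^{N-1} h(X_n) \,\Big|\, X_0 = x \right). \]

Thus, to show that $w_{x0} > -\infty$, it suffices to show that the expected first passage holding 
cost is finite, i.e., 

\[ H_x = E\left( \sum_{n=0}^{N-1} h(X_n) \,\Big|\, X_0 = x \right) < \infty, \quad x > 0. \]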



From Corollary C.2.4 of Sennott (1998) it follows that in order to show $H_x < \infty$, it is 
sufficient to establish that there exists a nonnegative finite function $W(x)$ such that 

\begin{align}
\sum_j P_{0j} W(j) &< \infty, \tag{47} \\
\sum_j P_{xj} (W(x) - W(j)) &\ge h(x), \quad x > 0. \tag{48}
\end{align}

We will prove that there exist a sufficiently small $p$ and a sufficiently large $M > 0$ such that 
these inequalities are satisfied by the function $W(x) = M\theta^x$, where $\theta > 1$ is the constant appearing 
in Assumption 1. First, (47) is immediate, since $\sum_j P_{0j} W(j) = P_{00} M + P_{01} M \theta < \infty$. For $x > 0$, 

\[ \sum_j P_{xj} (W(x) - W(j)) = \frac{\lambda p}{\Lambda} M (\theta^x - \theta^{x+1}) + \frac{\mu_l}{\Lambda} M (\theta^x - \theta^{x-1}) = M \theta^{x-1} (\theta - 1) \frac{\mu_l - \lambda p \theta}{\Lambda} . \]

On the other hand, from Assumption 1 (i), it follows that $h(x) \le h(1)\theta^{x-1}$, $x > 0$. Therefore, if 
we take $p < \min\left(\frac{\mu_l}{\lambda\theta}, 1\right)$, so that $\mu_l - \lambda p \theta > 0$, and $M \ge \frac{\Lambda h(1)}{(\theta - 1)(\mu_l - \lambda p \theta)}$, then it is true that 

\[ \sum_j P_{xj} (W(x) - W(j)) \ge h(1)\theta^{x-1} \ge h(x), \quad x > 0. \]

Summarizing, we have shown that for sufficiently small $p$ there exists a function $W(x)$ satisfying 
(47), (48), thus policy $d(p)$ is 0-standard and the proof of the theorem is complete. 
■ 
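
The drift computation in the proof can be reproduced numerically. In the Python sketch below, the function names and the numerical values are illustrative assumptions: $\theta = 4$ is chosen only because $x^2 \le \theta^{x-1}$ holds for $h(x) = x^2$, and whether this matches the constant of Assumption 1 is not checked here. The value $\Lambda = 10$ is likewise assumed.

```python
def drift(x, lam, mu_l, p, Lam, M, theta):
    """sum_j P_xj * (W(x) - W(j)) for W(x) = M * theta**x and x > 0 under d(p)."""
    def W(y):
        return M * theta ** y

    up, down = lam * p / Lam, mu_l / Lam
    # the self-transition contributes zero because W(x) - W(x) = 0
    return up * (W(x) - W(x + 1)) + down * (W(x) - W(x - 1))


def drift_closed_form(x, lam, mu_l, p, Lam, M, theta):
    """Closed form M * theta**(x-1) * (theta - 1) * (mu_l - lam*p*theta) / Lam."""
    return M * theta ** (x - 1) * (theta - 1) * (mu_l - lam * p * theta) / Lam


if __name__ == "__main__":
    params = dict(lam=5.0, mu_l=3.0, p=0.1, Lam=10.0, M=4.0, theta=4.0)
    for x in range(1, 6):
        print(x, drift(x, **params), drift_closed_form(x, **params), x * x)
```

Here $p = 0.1 < \mu_l/(\lambda\theta) = 0.15$ and $M = 4$ exceeds $\Lambda h(1)/((\theta-1)(\mu_l - \lambda p \theta)) = 10/3$, so the drift dominates $h(x) = x^2$ in every state printed.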

From Corollary 7.5.4 of Sennott (1998), we also obtain that the relative value function $w(x,i)$ 
is nonnegative and nonincreasing in $x$ for any $i$. 

The results on the structure of the discounted optimal policy are extended to the average 
reward case, since the average optimal policy is obtained as a limit of a sequence of discounted 
optimal policies. 

Specifically, there exist admission and service thresholds as in Theorem 2, and if $R < c/\delta$ the 
thresholds are ordered as in Proposition 1; in this case the service rate flexibility is of no 
value. 

5 Reward Collected at Departure Epochs 



In the previous sections it was assumed that the reward R is collected upon admitting a customer. 
While this is plausible in many situations such as ticket-based operations or call centers with 
upfront charge, it is also often the case that the service reward is collected at the time of 
departure of the customer, for example in jobshops with payment upon delivery. In this section 
we formulate the corresponding MDP model for the second case, point out the similarities and 
differences and show that essentially all the conclusions obtained so far still hold. 

When $R$ is collected at departure epochs, the Markov Decision Process in finite horizon 
corresponding to (5)-(8) now takes the following form: 

\begin{align}
v_{n+1}(x,1) &= \max\{v_{n+1}(x+1,0),\ v_{n+1}(x,0)\}, \quad x \ge 0, \tag{49} \\
v_{n+1}(x,0) &= -h(x) + \lambda v_n(x,1) + \mu_l (R + v_n(x-1,0)) \notag \\
 &\quad + \max\{\delta v_n(x,0),\ -c + \delta (R + v_n(x-1,0))\}, \quad x > 0, \tag{50} \\
v_{n+1}(0,0) &= \lambda v_n(0,1) + \mu_h v_n(0,0), \tag{51} \\
v_0(x,i) &= 0, \quad x \ge 0,\ i \in \{0,1\}. \tag{52}
\end{align}

Note that in (49)-(52) the term for $R$ is added at transitions from $(x,0)$ to $(x-1,0)$, whereas 
at admission epochs there is no reward collected. 

As in the original model, let $\Delta_n(x,i) = v_n(x,i) - v_n(x+1,i)$. 

Essentially the only difference between the two models is that Lemma 1 no longer holds, 
i.e., the value function is not necessarily nonincreasing in $x$. This is intuitively expected, since an 
additional customer in the queue brings with him the prospect of a future reward as well as a 
burden due to the higher holding costs. Mathematically, the induction proof of Lemma 1 breaks 
down at state $(0,0)$. Indeed, it is now true that 

\[ \Delta_{n+1}(0,0) = h(1) - \mu_l R + \lambda \Delta_n(0,1) + \min\{\delta \Delta_n(0,0),\ c - \delta R\}, \]

which is not necessarily nonnegative due to the terms $-\mu_l R$ and $-\delta R$. 

Therefore, it is not generally true anymore that $\Delta_n(x,i) \ge 0$, thus it cannot be interpreted 
as a burden, but rather as the net effect of an additional customer, which can be either a burden 
or a benefit. 

On the other hand, by following the remaining proofs in the original model, it can be verified 
that all the results on the monotonicity of $\Delta_n(x,0)$ in $x$, the threshold structure of the optimal 
policy and the properties of the value of flexibility function still hold. The admission and service 
rate thresholds now take the form 

\[ B^s_n = 1 + T_{\Delta_n(\cdot,0)}\left( \frac{c}{\delta} - R \right), \qquad B^d_n = T_{\Delta_n(\cdot,0)}(0), \]

where $\Delta_n$ corresponds to the new net effect function and $T_f(\theta)$ is the same generic threshold 
function defined in (14). 
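
That the net effect of an additional customer can be negative in this model is easy to see from a single step of the recursion. The Python sketch below applies (49)-(51) once to zero terminal values; the function name, the truncation of the state space and the rescaled parameter values are assumptions made for illustration.

```python
def departure_reward_step(v0, v1, lam, mu_l, mu_h, R, c, h):
    """One application of (49)-(51); returns the new lists for states (x, 0) and (x, 1).

    Lists are indexed by x = 0..x_max. Admission is not offered at x_max,
    a truncation assumption made only to keep the state space finite.
    """
    delta = mu_h - mu_l
    x_max = len(v0) - 1
    new0 = [lam * v1[0] + mu_h * v0[0]]                                   # (51)
    for x in range(1, x_max + 1):                                         # (50)
        service = max(delta * v0[x], -c + delta * (R + v0[x - 1]))
        new0.append(-h(x) + lam * v1[x] + mu_l * (R + v0[x - 1]) + service)
    new1 = [max(new0[x + 1], new0[x]) for x in range(x_max)]              # (49)
    new1.append(new0[x_max])
    return new0, new1


if __name__ == "__main__":
    scale = 10.5    # assumed rescaling constant
    zeros = [0.0] * 6
    v0, v1 = departure_reward_step(zeros, zeros, 5 / scale, 3 / scale, 5 / scale,
                                   R=4.0, c=6 / scale, h=lambda x: x * x / scale)
    print("net effect at (0,0) after one step:", v0[0] - v0[1])
```

With zero terminal values the printed quantity equals $h(1) - \mu_l R + \min\{0, c - \delta R\}$, which is negative whenever the reward terms outweigh the holding cost $h(1)$.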

6 Computational Results 

In this section we present the results of some computational experiments, which explore the 
value of service rate flexibility and the influence of the service rate switch option on the optimal 
admission policy. In the previous sections it was established analytically that when $R < c/\delta$ 
the service rate flexibility is essentially of no value. This is so because from Proposition 1 
$B^s \ge B^d + 1$, thus either the system stops accepting customers before the higher service rate is 
used, or the higher service rate is employed at the last state before the rejection sign is posted. 
Theorem 4 further implies that in this case the optimal admission thresholds are the same with 
and without the service rate switch option. However, even if $R \ge c/\delta$, it may still be true that 
$B^s \ge B^d + 1$ and thus $\tilde B^d = B^d$. 

Our first numerical experiment demonstrates that, although the relationship between $R$ and 
$c/\delta$ by itself does not uniquely characterize the value of service rate flexibility, it provides a good 
approximation for such a characterization. To do so, we consider a system with $\lambda = 5$, $\mu_l = 3$, 
$\mu_h = 5$, $c = 6$, $h(x) = x^2$, $\beta = 0.5$ and varying value of $R$. 

Figure 1 shows the relative value of the service rate flexibility as a fraction of the total profit 
of the combined problem at state $(0,0)$ (panel (a)). We observe that the relative value of the 
service flexibility is increasing in $R$, when $R > c/\delta$. Panel (b) shows the service and admission 
thresholds for the combined and the admission problem. The service threshold is decreasing 
in $R$, while both admission thresholds $B^d$ and $\tilde B^d$ are increasing. Figure 1 also verifies that 
when $R < c/\delta$, it is true that $e(0,0) = 0$, $B^s = B^d + 1$, and $\tilde B^d = B^d$. However these are 
still true for values of $R$ slightly greater than $c/\delta$. Specifically, there exists a critical value $\bar R$ 
such that $B^s \ge B^d + 1$ when $R \le \bar R$ and $B^s \le B^d$ otherwise, and this value $\bar R$ is greater 
than $c/\delta$. To examine this issue further, we perform a more general numerical experiment, where 
$\lambda$, $\mu_l$, $\mu_h$, $h(x)$, $\beta$ are the same as before and we vary parameters $R$ and $c$. Figure 2 presents the 
value of $\bar R$ for varying values of $c/\delta$. The curve of $\bar R$ is indeed above the diagonal $R = c/\delta$, however 
only slightly. This shows that the condition $R > c/\delta$ is a good approximation to characterize the 
value of service flexibility. In all numerical cases we analyzed we observed similar behavior, i.e., 
there exists a critical value of $R$ above which it is true that $B^s \le B^d$, and this critical value is 
higher than but close to $c/\delta$. 

In the third numerical experiment we explore the sensitivity of the service flexibility with 
respect to the relative increase $\delta/\mu_l$ in service capacity, offered by introducing the high service 
rate option. This effect is presented for two values of the traffic intensity, $\lambda$, one low, $\lambda = 2$, and 
one high, $\lambda = 20$. To do so, we consider a system with $\mu_l = 3$, $R = 4$, $c = 8$, $h(x) = x^2$, $\beta = 1$ 
and varying value of the high service rate $\mu_h$. 

In Table 1 we present numerical results for the relative value of the service flexibility as a 
fraction of the total profit of the combined problem at state $(0,0)$ and for the values of service 
and admission thresholds of the combined and the admission problem, respectively, for the 
two values of the traffic intensity. We observe that the value of flexibility and the admission 
thresholds are increasing in $\delta/\mu_l$, whereas the service threshold is decreasing. We 
also note that, as expected, when $\delta < c/R$ the flexibility is essentially of no value and $B^s = B^d + 1$. 
Finally, the effect of the higher traffic rate is more pronounced in the value of flexibility than in 
the threshold values. 

Note that in all numerical experiments the parameters do not generally satisfy the normalization 
assumption, $\Lambda + \beta = 1$. However, the required rescaling has been performed in the 
computations. 

7 Conclusions and Extensions 

In this paper we analyzed the problem of joint dynamic admission and service control in an 
M/M/1 queue under expected discounted profit maximization. We established a threshold 
structure for the optimal service rate-admission control policy. We defined the value of the 
service rate flexibility as the benefit that the higher service rate option brings to a system 
with pure admission control, and showed that the value of flexibility is increasing with system 
congestion. We finally identified a simple condition between the admission reward and the 
relative cost of high service rate, under which the admission policy is not affected and the value 
of service flexibility is zero in low congestion states. 

Figure 1: The value of service flexibility and the admission and service threshold as a function 
of $R$ for $\lambda = 5$, $\mu_l = 3$, $\mu_h = 5$, $c = 6$, $h(x) = x^2$, $\beta = 0.5$. 

Figure 2: Critical value $\bar R$ as a function of $c/\delta$ for $\lambda = 5$, $\mu_l = 3$, $\mu_h = 5$, $h(x) = x^2$, $\beta = 0.5$. 

Table 1: Relative value of service flexibility and thresholds of the combined problem and the 
admission control subproblem as a function of $\delta/\mu_l$, for $\mu_l = 3$, $R = 4$, $c = 8$, $h(x) = x^2$, $\beta = 1$ 
and $\lambda \in \{2, 20\}$. 

The analysis in this paper was conducted in the context of an M/M/1 queue, where the high 
service rate means that the server operates with higher speed. However in many real applica- 
tions such as banks or call centers the service capacity is affected by dynamically varying the 
number of operating servers. This corresponds to an M/M/m queue with m being determined 
by a dynamic policy. We conjecture that the admission policy will still be threshold-based. 
On the other hand, it would be interesting to study how the service rate policy is structured 
under different assumptions on the service rate switch option, e.g. assuming that all servers are 
required to use the same service rate, or that each server is allowed to select its own rate. Fur- 
thermore, the assumption of identical customers may be relaxed by assuming multiple customer 
classes differentiated by admission reward and/or service rate. If the service rates are identical 
among classes, it is expected that the optimal admission policy is determined by class-dependent 
admission thresholds which are increased when the high service rate option becomes available. 
It would also be interesting to consider the value of service rate flexibility when there is a fixed 
service rate switch cost, which brings hysteretic policies into play. Finally, in the present model 
we considered maximization of net profit from the point of view of a single decision maker, who 
may be the system owner or a collective representative maximizing the total customer bene- 
fit. It would be interesting to consider equilibrium strategies in a game theoretic model where 
the server determines the service rate and arriving customers respond by deciding individually 
whether to join the queue or balk. Such a model could also include pricing as an additional 
policy component available to the server. The extension to equilibrium models with dynamic 
service control is the object of our current research work. 